CLARO: Modeling and Processing of High-Volume Uncertain Data Streams

Authors

  • Thanh Tran
  • Liping Peng
  • Boduo Li
  • Yanlei Diao
  • Anna Liu
Abstract

Uncertain data streams, where data is incomplete, imprecise, and even misleading, have been observed in a variety of environments. Feeding uncertain data streams to existing stream systems can produce results of unknown quality, which is of paramount concern to monitoring applications. In this paper, we present the Claro system that supports uncertain data stream processing for data that is naturally captured using continuous random variables. The Claro system employs a unique data model that is flexible and allows efficient computation. Built on this model, we develop evaluation techniques for complex relational operators, including aggregates and joins, by exploring advanced statistical theory and approximation techniques. Our evaluation results show that our techniques can achieve high performance in stream processing while satisfying accuracy requirements, and these techniques significantly outperform a state-of-the-art sampling-based method. Furthermore, initial results of a case study show that our modeling and aggregation techniques can allow a tornado detection system to produce better quality results yet with lower execution time.
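As a rough illustration of what aggregating continuous random variables can involve, the sketch below sums a window of uncertain values under the assumption that each value is an independent Gaussian. That assumption, the names `Gaussian`, `sum_independent`, and `prob_exceeds`, and the sample window are all hypothetical; the abstract does not specify Claro's data model or algorithms at this level of detail.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Gaussian:
    """An uncertain value modeled (by assumption) as N(mean, var)."""
    mean: float
    var: float


def sum_independent(values):
    """Sum of independent Gaussians: means add and variances add."""
    total_mean = sum(v.mean for v in values)
    total_var = sum(v.var for v in values)
    return Gaussian(total_mean, total_var)


def prob_exceeds(g, threshold):
    """P(X > threshold) for X ~ N(g.mean, g.var)."""
    if g.var == 0:
        # Degenerate case: the value is known exactly.
        return 1.0 if g.mean > threshold else 0.0
    z = (threshold - g.mean) / math.sqrt(2.0 * g.var)
    return 0.5 * math.erfc(z)


# Hypothetical window of three uncertain readings.
window = [Gaussian(10.0, 1.0), Gaussian(12.0, 4.0), Gaussian(9.0, 0.25)]
total = sum_independent(window)
print(total)
print(prob_exceeds(total, 35.0))
```

The closed form holds only under the independence and Gaussian assumptions. Other distributions or correlated tuples would need approximation or sampling, which is the kind of trade-off the abstract alludes to.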

Similar resources

Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows

Today, many modern applications search for frequent, repeating patterns in the data sets they analyze. The search looks for patterns that appear frequently in a data set and marks them as frequent patterns, so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...

Statistical Modeling of Sensor Data and its Application to Outlier Detection

Various applications rely on continuous processing of data streams originating from a network of interconnected, collaborating sensors. Processing those streams has turned out to be a difficult task, as sensors have only limited resources and the data they produce is inherently uncertain and unreliable. In order to bridge the gap from raw, uncertain sensor readings to a meaningful mod...

FPGA Implementation of JPEG and JPEG2000-Based Dynamic Partial Reconfiguration on SOC for Remote Sensing Satellite On-Board Processing

This paper presents the design procedure and implementation results of proposed hardware that performs different satellite image compressions on a Xilinx FPGA board. First, the method is described; then VHDL code is written and synthesized with the Xilinx ISE software. The results show that it is easy and useful to design, develop and implement the hardware image compressor using ne...

Complex Event Processing over Distributed Uncertain Event Streams

In the 21st century, as perceptual recognition technologies develop, information-generating devices have begun to sense, measure, and monitor the physical world accurately in real time. Complex Event Processing (CEP), which can be used to extract user-level information from raw data, has become a key part of IoT middleware. Most current studies of complex event processing have not focused o...

Modeling and Querying Data Series and Data Streams with Uncertainty

Many real applications consume data that is intrinsically uncertain and error-prone. An uncertain data series is a series whose point values are uncertain. An uncertain data stream is a data stream whose tuples are existentially uncertain and/or have an uncertain value. Typical sources of uncertainty in data series and data streams include sensor data, data synopses, privacy-preserving transfor...

Publication date: 2009